We present an improved draft genome sequence for Clostridium pasteurianum strain ATCC 6013 (DSM 525), the type strain of 
the species and an important solventogenic bacterium with industrial potential. Availability of a near-complete genome se- 
quence will enable strain engineering of this promising bacterium. 



Received 16 July 2014 Accepted 21 July 2014 Published 7 August 2014 

Citation Pyne ME, Utturkar S, Brown SD, Moo-Young M, Chung DA, Chou CP. 2014. Improved draft genome sequence of Clostridium pasteurianum strain ATCC 6013 (DSM 525)
using a hybrid next-generation sequencing approach. Genome Announc. 2(4):e00790-14. doi:10.1128/genomeA.00790-14.

Copyright © 2014 Pyne et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 Unported license. 

Address correspondence to Duane A. Chung, duane.chung@uwaterloo.ca, or C. Perry Chou, cpchou@uwaterloo.ca. 



Clostridium pasteurianum is a mesophilic, anaerobic Gram- 
positive bacterium that is apathogenic, can be easily cultivated 
in chemically defined media, and is more aerotolerant than many 
Clostridia (1, 2). Historically, C. pasteurianum has been utilized 
extensively as a model organism for the study of nitrogen fixation 
(3) and clostridial ferredoxins (4). More recently, C. pasteurianum 
has received significant biotechnological attention because of its 
capacity to ferment waste glycerol (5) and glycerol-rich thin still- 
age (6), which are major by-products of biodiesel and bioethanol 
production, respectively. In addition to acids and carbon dioxide, 
glycerol is converted to appreciable quantities of butanol, 1,3- 
propanediol, and hydrogen gas (7), which have industrial poten- 
tial as chemicals or biofuels. To allow genetic manipulation of this 
organism, several recent studies have outlined the need for a ge- 
nome sequence of C. pasteurianum (5, 8). A concurrent effort has 
reported a draft genome sequence for the type strain (9), while 
partial or full genome sequences are also available for two 
C. pasteurianum isolates (BC1 [http://www.ncbi.nlm.nih.gov/GenBank/]
and NRRL B-598 [10]), further demonstrating the appeal
of this species. Here we report an improved draft genome
assembly for the type strain of C. pasteurianum, which was gener- 
ated using a hybrid next-generation sequencing approach. 

The genome of C. pasteurianum ATCC 6013 was sequenced 
using 454, Illumina MiSeq, and single-molecule real-time 
(SMRT) RS I and RS II sequencing platforms with sequence coverages
of 20×, 335×, 80×, and 90×, respectively. De novo genome
assembly of PacBio data was performed using the HGAP.1 protocol
from SMRT Analysis software version 2.1. The resulting
draft genome sequence of C. pasteurianum ATCC 6013 comprises 
4,420,124 bp and 12 contigs, an improvement on the 37 contigs 
reported previously (9). The N50 contig size was improved from 
229 kb to 859 kb. Geneious software (Biomatters Ltd., Auckland, 
New Zealand) identified two supercontigs, with putative contig 
orderings of ctg10-ctg1C-ctg5-ctg3C-ctg9-ctg12-ctg4-ctg11 and
ctg6-ctg7-ctg2-ctg8, respectively (C = complement of contig).
Mapping of Illumina and 454 reads against the PacBio assembly
detected only 5 SNPs, which were corrected; this low number reflects
the high quality of the assembly obtained from PacBio reads.
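
For readers unfamiliar with the N50 statistic quoted above, the following minimal Python sketch shows how it is conventionally computed from a list of contig lengths. The function name `n50` and the example lengths are illustrative assumptions; they are not the contig lengths of the assembly reported here, and this is not the authors' own tooling.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the length L such that contigs of length >= L together
    account for at least half of the total assembly length.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    total = sum(contig_lengths)
    running = 0
    # Walk contigs from longest to shortest, accumulating length.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point rounding of total / 2.
        if 2 * running >= total:
            return length


# Hypothetical contig lengths in bp (NOT taken from this assembly).
example_lengths = [900_000, 700_000, 400_000, 250_000, 150_000]
print(n50(example_lengths))
```

Because the statistic depends only on contig lengths, merging contigs (here, from 37 to 12) generally raises N50 even when the total assembly size changes little.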

Genome annotation was performed using the Oak Ridge Na- 
tional Laboratory annotation pipeline, based on the Prodigal gene 
prediction algorithm (11). Ribosomal RNAs were annotated using 
RNAmmer 1.2 (12), and transfer RNAs were predicted using 
tRNAscan-SE (13). The G+C content of the genome is 30%. Approximately
4,047 protein-coding, 95 tRNA, and 29 rRNA (14 × 16S and
15 × 23S) genes were predicted in the genome. No
putative extrachromosomal elements were identified. Gene sequences
for enzymes involved in the production of primary metabolites,
including ethanol, butanol, and 1,3-propanediol, are
present in the genome sequence, many of which are organized in
operons similar to those of C. acetobutylicum (14). An acetone formation
locus (adhE-ctfAB-adc) possessing significant sequence similarity 
to that of C. acetobutylicum (15) is encoded in the genome, yet, as 
indicated previously (9), acetone production by C. pasteurianum 
has not been reported. It is expected that the draft genome se- 
quence presented herein will guide future strain improvement 
efforts involving C. pasteurianum (16). 
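
The G+C content quoted above is a simple base-composition statistic. As a minimal sketch, assuming the sequence is available as a plain string of nucleotides, it could be computed as follows. The function name `gc_content` and the toy sequence are hypothetical and are not part of the annotation pipeline used in this work; ambiguous bases such as N are excluded from the denominator by assumption.

```python
def gc_content(sequence):
    """Return the G+C content of a nucleotide sequence as a percentage.

    Only unambiguous bases (A, C, G, T) are counted; anything else
    (e.g., N or gap characters) is ignored.
    """
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A, C, G, or T bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


# Toy sequence for illustration only (NOT from the C. pasteurianum genome).
toy_sequence = "ATGCATGATA"
print(gc_content(toy_sequence))
```

For a multi-contig draft assembly, the same calculation would be applied to the concatenation of all contigs so that each base contributes equally.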

Nucleotide sequence accession numbers. This Whole Ge- 
nome Shotgun project has been deposited at DDBJ/EMBL/Gen- 
Bank under the accession JPGY00000000. The version described 
in this report is version JPGY01000000. 